INTRODUCTION


Inequalities play a fundamental role in nearly all branches of mathematics,
and especially so in probability and statistics. The impact of basic
inequalities such as those that carry the names of Cauchy-Schwarz, Chebyshev,
Cramér-Rao, and Bonferroni in statistics is well known. Inequalities have
been profitably used to obtain bounds for probabilities that are tedious
to compute or analytically impossible to handle. Especially in reliability
problems, the limited assumptions that can be made about the nature of the
life distributions of the components of a system, as well as about the
structure of the system itself, render inequalities not merely useful and
desirable but essential. Since interest in inequalities pervades nearly all
branches of mathematics, significant contributions have been made by a very
large number of researchers whose efforts span well over a century. From
time to time, books and monographs have been written which are completely
devoted to inequalities.* The classic book of Hardy, Littlewood and Pólya [35],
first published in 1934, is a remarkable collection of mathematical inequalities.
Some important works that followed are Beckenbach and Bellman [12], Godwin [20],
Kazarinoff [40], Marshall and Olkin [47], Mitrinović [49], [50], Pólya and
Szegő [54], Shisha [57], and Tong [59]. Of these, the monographs of Marshall
and Olkin [47] and Tong [59] contain the recent developments in the area of


*This  research  was  supported  by  the  Office  of  Naval  Research  Contract 
N00014-75-C-0455  at  Purdue  University.  Reproduction  in  whole  or  in  part  is 
permitted  for  any  purpose  of  the  United  States  Government. 
multivariate probability inequalities; this topic has seen major growth in
the last ten or fifteen years. In this connection we also refer to a recent
review paper by Eaton [19].

In selection and ranking problems, inequalities and monotonicity properties
have a vital role to play. Consider the classical formulations of these
problems in which one proposes a procedure which will guarantee a minimum
probability of correct selection (PCS). This amounts to evaluating the PCS,
determining the parametric configuration for which the PCS is minimum, and
then determining the constants defining the procedure so that this minimum is
at least a specified level P*. Determining this configuration, known as a
least favorable configuration (LFC), is a vital part of the analysis. There
are a number of problems in which the LFC cannot be analytically established;
in such cases, recourse has been taken to obtaining a good lower bound for the
PCS first and then seeking the LFC for this lower bound. Even when the LFC for
the PCS can be analytically established, inequalities are useful in obtaining
conservative but easier-to-compute values for the constants of the procedure.
Similar  situations  arise  when  we  consider  the  worst  configuration  for  any 
suitable  performance  characteristic  such  as  the  expected  number  of  nonbest 
populations  included  in  the  selected  subset.  Additional  uses  of  inequalities 
arise  due  to  specific  assumptions  regarding  the  families  of  distributions 
under  consideration;  for  example,  distributions  having  an  increasing  failure 
rate  (IFR)  and  increasing  failure  rate  average  (IFRA).  For  a  general  view 
of  selection  and  ranking  problems  and  the  various  formulations  and  goals 
that  have  been  studied,  we  refer  to  Gupta  and  Panchapakesan  [31]. 

In this paper, we restrict our attention mainly to some inequalities and
monotonicity properties that have typically arisen in the development of the
selection and ranking theory. Basic to the setup of these problems is the
assumption regarding some order relations such as stochastic ordering and the
monotone likelihood ratio property. These and other related ideas, along with
some basic inequalities that arise under these assumptions, are discussed in
Section 2. In reliability models, partial order relations such as convex
ordering, star ordering and tail ordering play an important role. Section 3
deals with restricted families of distributions defined by such partial order
relations and some important inequalities obtained in the investigation of
selection problems for such families. Interesting inequalities appear in the
study of selection rules for normal, multinomial and gamma distributions.
These are discussed in Section 4.

2.  ORDERED  FAMILIES  OF  DISTRIBUTIONS 

Inherent to a selection and ranking problem is the choice of a ranking
parameter, say, $\theta$. The natural setup consists of k populations that are
described by their associated probability distributions $P_{\theta_i}$,
$i = 1, \ldots, k$, where $\theta_i \in \Omega$, a subset of the real line. In
other words, these populations belong to a family $\mathcal{P} = \{P_\theta\}$
indexed by $\theta \in \Omega$. A reasonable procedure can be proposed if we
have some knowledge of the structural properties of this family. For example,
if $X_1, \ldots, X_k$ are observations from the k populations, we would like to
say that large values of X generally go with large values of $\theta$. Such
statements bring in order relations for distributions belonging to the family.
We will now formalize such concepts and state some monotonicity results.

2.1.  Stochastic  Ordering  and  Monotone  Likelihood  Ratio  Property. 

Let X be a real valued random variable with distribution $P_\theta$,
$\theta \in \Omega$. Then the family $\mathcal{P} = \{P_\theta\}$,
$\theta \in \Omega$, is said to be stochastically increasing (SI) in $\theta$ if
for $\theta_1 < \theta_2$, the distributions $P_{\theta_1}$ and $P_{\theta_2}$
are distinct, and for any real number a,

(2.1)  $P_{\theta_1}[X \in (a, \infty)] \le P_{\theta_2}[X \in (a, \infty)]$.


It is well known that a stronger property is that of monotone likelihood
ratio (MLR) introduced by Karlin and Rubin [39], and this is equivalent to the
frequency function having total positivity of order 2 (TP$_2$). The concept of
total positivity is, however, more general and is not restricted to frequency
functions (see Karlin [38]).

A basic result of Lehmann ([44], p. 112, Problem 11) can be stated
as follows.

Theorem 2.1. Let $\{P_\theta\}$, $\theta \in \Omega$, be an SI family of
distributions and let $\psi(x)$ be a real valued function nondecreasing in x.
Then $E_\theta[\psi(X)]$ is nondecreasing in $\theta$.
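Theorem 2.1 can be checked numerically on a concrete SI family. The sketch below uses the binomial family indexed by its success probability, which is a standard example of an SI family; the sample size 8 and the particular nondecreasing function `psi` are hypothetical choices made only for illustration.

```python
from math import comb

def binomial_pmf(n, p):
    """Probabilities P(X = x), x = 0..n, for X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

def expectation(psi, n, p):
    """Exact E_p[psi(X)] for X ~ Binomial(n, p)."""
    return sum(psi(x) * w for x, w in enumerate(binomial_pmf(n, p)))

def psi(x):
    """A nondecreasing function of x (hypothetical choice for illustration)."""
    return min(x, 4) ** 2

# Evaluate E_p[psi(X)] on an increasing grid of parameter values.
thetas = [0.1 * i for i in range(1, 10)]
values = [expectation(psi, 8, p) for p in thetas]

# Theorem 2.1 predicts that the sequence of expectations is nondecreasing.
is_nondecreasing = all(a <= b for a, b in zip(values, values[1:]))
print(is_nondecreasing)
```

Because the expectation is computed exactly by summing over the support, the check involves no simulation error; it illustrates the theorem on one family and is not a proof.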

A straightforward generalization of this theorem, independently
obtained by Alam and Rizvi [4] and Mahamunulu [46], is given below.

Theorem 2.2. Let $\{P_\theta\}$, $\theta \in \Omega$, be an SI family of
distributions. Let $X_1, \ldots, X_k$ be independent random variables, $X_i$
having the distribution $P_{\theta_i}$, $\theta_i \in \Omega$,
$i = 1, \ldots, k$. Then $E_{\boldsymbol\theta}\,\psi(X_1, \ldots, X_k)$ is
nondecreasing in each component of
$\boldsymbol\theta = (\theta_1, \ldots, \theta_k)$ if $\psi(x_1, \ldots, x_k)$
is nondecreasing in each of its arguments.

Theorem 2.2 has been successfully applied to many selection problems.
For suitably chosen $\psi(x_1, \ldots, x_k)$, the expectation
$E_{\boldsymbol\theta}\,\psi(X_1, \ldots, X_k)$ becomes the PCS. The
monotonicity property of the expectation enables one to obtain the LFC.
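As an illustration of how such monotonicity locates the LFC, consider a hypothetical rule for k independent binomial populations with common sample size n: retain population i if and only if its count satisfies x_i >= max_j x_j - d. The rule, the constants n and d, and the function names below are assumptions made for this sketch, not a procedure taken from the text. The indicator that the best population is retained is nondecreasing in its own count and nonincreasing in the other counts, so (with the obvious modification of Theorem 2.2) the PCS should not increase as a nonbest parameter increases.

```python
from itertools import product
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_best_retained(n, d, ps):
    """Exact probability that the last population (assumed best) is retained
    by the hypothetical rule 'keep i iff x_i >= max_j x_j - d'."""
    total = 0.0
    for xs in product(range(n + 1), repeat=len(ps)):
        if xs[-1] >= max(xs) - d:
            weight = 1.0
            for x, p in zip(xs, ps):
                weight *= binom_pmf(n, p, x)
            total += weight
    return total

# Vary one nonbest parameter p1 up to the best parameter p_best.
n, d, p_best = 6, 1, 0.6
grid = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
pcs_values = [prob_best_retained(n, d, (p1, 0.3, p_best)) for p1 in grid]

# The PCS is expected to be nonincreasing in p1, so over this grid its
# minimum is attained at the largest value, p1 = p_best.
pcs_nonincreasing = all(a >= b - 1e-12 for a, b in zip(pcs_values, pcs_values[1:]))
print(pcs_nonincreasing)
```

The enumeration is exact but grows as (n+1)^k, so it is practical only for small n and k.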

Another generalization of Theorem 2.1 in a different direction is due
to Gupta and Panchapakesan [28] who considered a class of subset selection
rules defined through a class of functions h. For evaluating the infimum
of the PCS, we need to minimize over $\theta$ the expectation
$E_\theta[\psi(X, \theta)]$. The following theorem of Gupta and
Panchapakesan [28] gives a sufficient condition for the monotonicity of
$E_\theta[\psi(X, \theta)]$.

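Theorem 2.3. Let $F(\cdot\,; \theta)$, $\theta \in \Omega$, be a family of
absolutely continuous distributions on the real line $\mathbb{R}$ with
continuous densities $f(\cdot\,; \theta)$ and let $\psi(x, \theta)$ be a bounded
real valued function possessing first partial derivatives $\psi_x$ and
$\psi_\theta$ with respect to x and $\theta$, respectively, and satisfying
certain regularity conditions C. Then $E_\theta[\psi(X, \theta)]$ is
nondecreasing in $\theta$ provided that for all $\theta \in \Omega$,

(2.2)  $f(x; \theta)\,\psi_\theta(x, \theta) - \frac{\partial F(x; \theta)}{\partial\theta}\,\psi_x(x, \theta) \ge 0$  a.e. x,

where the regularity conditions C are:

(i) for all $\theta \in \Omega$, $\psi_x(x, \theta)$ is Lebesgue integrable on
$\mathbb{R}$; and
(ii) for every $[\theta_1, \theta_2] \subset \Omega$ and
$\theta_3 \in \Omega$, there exists g(x) depending only on $\theta_1$,
$\theta_2$, $\theta_3$ such that

$\left| \psi_\theta(x, \theta)\, f(x; \theta_3) - \frac{\partial F(x; \theta)}{\partial\theta}\,\psi_x(x, \theta_3) \right| \le g(x)$

for all $\theta \in [\theta_1, \theta_2]$ and g(x) is Lebesgue integrable on
$\mathbb{R}$.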

Remark 2.4. (1) If $\psi(x, \theta) = \psi(x)$ for all $\theta \in \Omega$, the
sufficient condition (2.2) reduces to
$\frac{\partial F(x; \theta)}{\partial\theta}\,\psi_x(x) \le 0$, which is
satisfied by the hypotheses of Theorem 2.1 since $\{F_\theta\}$ is SI and
$\psi(x)$ is nondecreasing in x.

(2) For the class of procedures defined by Gupta and Panchapakesan [28],
$\psi(x, \theta) = F(h(x); \theta)$ and (2.2) becomes

(2.3)  $f(x; \theta)\,\frac{\partial F(h(x); \theta)}{\partial\theta} - h'(x)\, f(h(x); \theta)\,\frac{\partial F(x; \theta)}{\partial\theta} \ge 0$

where $h'(x) = (d/dx)\, h(x)$.

(3) This condition has been specialized to the cases of (i) location parameter,
(ii) scale parameter, and (iii) convex mixtures of distributions by Gupta
and Panchapakesan for the purposes of specific applications.

(4) An analogue of this theorem for discrete distributions is given by
Panchapakesan [52], who has given in another paper [53] sufficient conditions
for monotonicity when $\Omega$ is a countable set.

(5) The monotonicity of $E_\theta[\psi(X, \theta)]$ in $\theta$ is strict if
strict inequality holds in (2.3) on a set of positive Lebesgue measure.

(6) Obvious modifications in Theorems 2.1 through 2.3 give monotonicity in
the opposite direction.

For subset selection rules the expected subset size has been used as a
performance characteristic. We naturally want to know the worst configuration
in the sense that it maximizes the expected subset size. The following theorem
(discussed and proved without a formal statement) of Gupta and Panchapakesan [28]
gives a sufficient condition for the expected subset size to be maximized
at an equi-parameter configuration.

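Theorem 2.5. Let $X_1, \ldots, X_k$ be independent random variables, $X_i$
having an absolutely continuous distribution $F(\cdot, \theta_i)$, with
continuous densities $f(\cdot, \theta_i)$. Let $\psi(x, \theta)$ be a bounded
function possessing the first partial derivatives $\psi_x$ and $\psi_\theta$
with respect to x and $\theta$, respectively, and satisfying the regularity
conditions of Theorem 2.3. Define

$B(\theta_1, \ldots, \theta_k) = \sum_{i=1}^{k} E_{\theta_i}\Big[\prod_{r=1,\, r \ne i}^{k} \psi(X, \theta_r)\Big]$.

Then

(2.4)  $B(\boldsymbol\theta \mid \theta_1 \le \cdots \le \theta_k) \le B(\boldsymbol\theta \mid \theta_1 = \cdots = \theta_k)$

provided that, for all $\theta_i \le \theta_j$ and a.e. x, the following holds:

(2.5)  $\frac{\partial\psi(x, \theta_i)}{\partial\theta_i}\, f(x; \theta_j) - \frac{\partial\psi(x, \theta_j)}{\partial x}\,\frac{\partial F(x; \theta_i)}{\partial\theta_i} \ge 0$.
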
Remarks 2.6. As in the case of Theorem 2.3, Gupta and Panchapakesan [28]
have specialized this for (i) location parameter, (ii) scale parameter, and
(iii) convex mixtures. For their class of procedures,
$\psi(x, \theta_i) = F(h(x); \theta_i)$, $i = 1, \ldots, k$. For location and
scale parameter cases, the usual choices are $h(x) = x + b$, $b \ge 0$, and
$h(x) = ax$, $a \ge 1$, respectively. In these cases, the left-hand side of
(2.3) is zero for all x, thereby showing that $E_\theta[\psi(X, \theta)]$
is independent of $\theta$. Further, the condition (2.5) in these cases reduces
to the monotone likelihood ratio property, a result directly proved by
Gupta [22]. □
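The independence of $E_\theta[\psi(X, \theta)]$ from $\theta$ in the location case can be illustrated numerically. The sketch below assumes, purely for illustration, the normal location family N(theta, 1) with h(x) = x + b, so that psi(x, theta) = Phi(x + b - theta); the integration limits and step count are arbitrary choices. For this family the expectation equals P(Z2 <= Z1 + b) for independent standard normal Z1, Z2, which is Phi(b / sqrt(2)).

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def expected_psi(theta, b, lo=-10.0, hi=10.0, steps=4000):
    """Midpoint-rule approximation of E_theta[F(X + b; theta)] for
    X ~ N(theta, 1), where F(.; theta) is the N(theta, 1) cdf."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h      # z plays the role of x - theta
        x = theta + z
        total += Phi(x + b - theta) * phi(x - theta) * h
    return total

b = 1.0
values = [expected_psi(theta, b) for theta in (-3.0, 0.0, 2.5)]
print(values, Phi(b / sqrt(2.0)))
```

All three values agree with each other and with the closed form up to quadrature error, which is what the vanishing of the left-hand side of (2.3) predicts.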
Now, we note that Theorem 2.2 is a simple generalization of Theorem 2.1
to $\mathbb{R}^k$, the k-dimensional Euclidean space. We now consider various
generalizations of the concepts of stochastic ordering and monotone likelihood
ratio to distributions in higher dimensions. To this end, we introduce the
following definitions.

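Definition 2.7. A function $\psi$ defined on $\mathbb{R}^k$ is said to be
increasing with respect to a partial order relation $\prec$ if
$\mathbf{x}_1 \prec \mathbf{x}_2$ implies
$\psi(\mathbf{x}_1) \le \psi(\mathbf{x}_2)$ for all
$\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^k$.

Definition 2.8. A set S in $\mathbb{R}^k$ is said to be an increasing set if its
indicator function is increasing; that is, if $\mathbf{x}_1 \in S$ and
$\mathbf{x}_1 \prec \mathbf{x}_2$, then $\mathbf{x}_2 \in S$.

Let $\mathbf{X}$ be a k-dimensional random vector with distribution
$P_{\boldsymbol\theta}$ on $\mathbb{R}^k$, where
$\boldsymbol\theta = (\theta_1, \ldots, \theta_k)$. Let
$P_{\boldsymbol\theta}(S) = P_{\boldsymbol\theta}(\mathbf{X} \in S)$ for any
measurable set S.
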


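Definition 2.9. A distribution $P_{\boldsymbol\theta}$ is said to have the
stochastically increasing property (SIP) in $\boldsymbol\theta$ if
$P_{\boldsymbol\theta_1}(S) \le P_{\boldsymbol\theta_2}(S)$ for every monotone
nondecreasing measurable set S and for every
$\boldsymbol\theta_1 \le \boldsymbol\theta_2$.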


The following lemma is due to Lehmann [43].

Lemma 2.10. A family of distributions $P_{\boldsymbol\theta}$ has SIP in
$\boldsymbol\theta$ if and only if
$E_{\boldsymbol\theta_1}\psi(\mathbf{X}) \le E_{\boldsymbol\theta_2}\psi(\mathbf{X})$
for all nondecreasing integrable functions $\psi(\mathbf{x})$ and
$\boldsymbol\theta_1 \le \boldsymbol\theta_2$.

The following theorem follows easily from Lemma 2.10.

Theorem 2.11. Let the distribution of $\mathbf{X}$ have SIP in
$\boldsymbol\theta$ and let $\psi(\mathbf{x}, \boldsymbol\theta)$ be
nondecreasing in $\mathbf{x}$ and $\boldsymbol\theta$. Then
$E_{\boldsymbol\theta}\,\psi(\mathbf{X}, \boldsymbol\theta)$ is nondecreasing in
$\boldsymbol\theta$.

When we have independence, it is easily verified that the MLR property
implies SIP (Lehmann [43]). When we deal with correlated random variables
$X_1, \ldots, X_k$, it is natural to look for a generalized concept of MLR in
higher dimensions. For a density $f(x; \theta)$ in the one-dimensional case, the
MLR property says that

(2.6)  $f(x_1; \theta_1)\, f(x_2; \theta_2) - f(x_1; \theta_2)\, f(x_2; \theta_1) \ge 0$

for every $x_1 \le x_2$ and $\theta_1 \le \theta_2$. We can rewrite (2.6) in
the form

(2.7)  $f(\mathbf{x}; \boldsymbol\theta) \ge f(\mathbf{x}; (1,2)\boldsymbol\theta)$

where $f(\mathbf{x}; \boldsymbol\theta) = \prod_{i=1}^{2} f(x_i; \theta_i)$,
$\boldsymbol\theta = (\theta_1, \theta_2)$, and $(1,2)\boldsymbol\theta$ is the
vector obtained from $\boldsymbol\theta$ by interchanging $\theta_1$ and
$\theta_2$. This provides the motivation for the following definition of
Property M by Eaton [18].

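Definition 2.12. A family of real valued density functions
$\{f_\alpha(\mathbf{x}; \boldsymbol\theta)\}$, $\alpha \in \mathcal{A}$, is said
to have Property M if, for each $\alpha \in \mathcal{A}$ and for each
pair (i, j), $1 \le i \ne j \le k$, the following holds:

(2.8)  $x_i \ge x_j$ and $\theta_i \ge \theta_j$ $\Rightarrow$ $f_\alpha(\mathbf{x}; \boldsymbol\theta) \ge f_\alpha(\mathbf{x}; (i,j)\boldsymbol\theta)$.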

Eaton [18] has given a necessary and sufficient condition for a class
of densities to possess Property M. Bechhofer, Kiefer and Sobel ([11], p. 41)
in their monograph on sequential identification and selection rules define
a rankability condition which is the same as Property M. Hollander, Proschan
and Sethuraman [36] have defined a concept of decreasing in transposition
(DT) which is also the same as Property M; however, their motivation comes from
finding classes of functions which share certain properties of Schur
functions. In fact, when
$g(\mathbf{x}, \boldsymbol\theta) = h(\mathbf{x} - \boldsymbol\theta)$, g is DT
on $\mathbb{R}^{2k}$ if and only if h is Schur-concave on $\mathbb{R}^k$.

It  is  important  to  note  that,  unlike  in  the  case  of  one-dimensional 
distributions,  Property  M  does  not  imply  SIP.  The  following  simple  example 
of  Hsu  [37]  illustrates  this  point. 

Example 2.13. $\mathbf{X} = (X_1, X_2)$ has the following distribution for four
permissible values of $\boldsymbol\theta = (\theta_1, \theta_2)$.

                        x
    theta        (5,6)     (6,5)
    (1,2)         0.9       0.1
    (2,1)         0.1       0.9
    (3,4)         0.6       0.4
    (4,3)         0.4       0.6
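The claim illustrated by Example 2.13 can be verified mechanically from the table. The sketch below encodes the four distributions, checks condition (2.8) for k = 2, and searches for violations of SIP under the componentwise partial order. The function names are hypothetical, and the use of the componentwise order for both $\mathbf{x}$ and $\boldsymbol\theta$ is an assumption of this sketch.

```python
from itertools import combinations

# f[theta][x]: the probabilities from the table in Example 2.13.
f = {
    (1, 2): {(5, 6): 0.9, (6, 5): 0.1},
    (2, 1): {(5, 6): 0.1, (6, 5): 0.9},
    (3, 4): {(5, 6): 0.6, (6, 5): 0.4},
    (4, 3): {(5, 6): 0.4, (6, 5): 0.6},
}
support = [(5, 6), (6, 5)]

def leq(u, v):
    """Componentwise partial order on vectors."""
    return all(a <= b for a, b in zip(u, v))

def has_property_m(f):
    """Check (2.8) for k = 2: when x and theta are ordered the same way,
    f(x; theta) must be at least f(x; theta with coordinates swapped)."""
    for theta, row in f.items():
        swapped = (theta[1], theta[0])
        for x, prob in row.items():
            same_order = (x[0] - x[1]) * (theta[0] - theta[1]) >= 0
            if same_order and prob < f[swapped][x]:
                return False
    return True

def increasing_subsets(points):
    """Subsets S of the support with: x in S and x <= y imply y in S."""
    subsets = []
    for r in range(len(points) + 1):
        for combo in combinations(points, r):
            s = frozenset(combo)
            if all(y in s for x in s for y in points if leq(x, y)):
                subsets.append(s)
    return subsets

def sip_violations(f, points):
    """Triples (theta1, theta2, S) with theta1 <= theta2 but P_theta1(S) > P_theta2(S)."""
    bad = []
    for t1 in f:
        for t2 in f:
            if t1 != t2 and leq(t1, t2):
                for s in increasing_subsets(points):
                    p1 = sum(f[t1][x] for x in s)
                    p2 = sum(f[t2][x] for x in s)
                    if p1 > p2 + 1e-12:
                        bad.append((t1, t2, s))
    return bad

print(has_property_m(f), sip_violations(f, support))
```

For instance, with S the increasing set generated by (5,6), the probability drops from 0.9 at theta = (1,2) to 0.6 at theta = (3,4), so SIP fails even though (2.8) holds throughout the table.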

Further, we can have SIP without Property M; this is true in one
dimension also. Finally, it is possible to have both SIP and Property M, as
is the case with the multinomial distribution.

Another generalization of MLR is given by Gupta and Huang [25] who
obtained for a family of densities having this generalized MLR property an
essentially complete class of multiple decision rules.

Definition 2.14. A probability density $f(\mathbf{x}; \boldsymbol\theta)$ is
said to have a generalized monotone likelihood ratio (GMLR) in $\mathbf{x}$, if
for every i and all fixed $x_j$, $j = 1, \ldots, k$, $j \ne i$,
$f(\mathbf{x}; \boldsymbol\theta_1)/f(\mathbf{x}; \boldsymbol\theta_2)$ is
nondecreasing in $x_i$, where
$\boldsymbol\theta_\ell = (\theta_{\ell 1}, \ldots, \theta_{\ell k})$,
$\ell = 1, 2$, $\theta_{1j} = \theta_{2j}$ for all $j \ne i$, and
$\theta_{1i} > \theta_{2i}$.

What we have discussed so far are some basic assumptions that are
usually made regarding the underlying family, and the monotonicity behavior
of the expectations of certain functions. Also of relevance here is the
concept of stochastic majorization and inequalities obtained by majorization.
One definition of stochastic majorization is to say that $\mathbf{X}$ is
stochastically majorized by $\mathbf{Y}$ if
$E(\psi(\mathbf{X})) \le E(\psi(\mathbf{Y}))$ for all Schur-convex functions
$\psi$; of course, there are other possible definitions (see Marshall and
Olkin [47], Chapter 11). Majorization techniques can be used to show that
$E[\psi(\mathbf{X})] \le E[\psi(\mathbf{Y})]$ for several other families of
functions $\psi$. The relevance of these results to selection problems is
obvious, when $\psi(\mathbf{X})$ is the indicator function of the event "a
correct selection is made." For several useful inequalities in this direction,
we refer to Chapters 12 and 13 of Marshall and Olkin [47].

3.  RESTRICTED  FAMILIES  OF  DISTRIBUTIONS 

By restricted families of distributions, we mean a family of distributions
$\mathcal{F}$ each member of which is partially ordered in a sense with respect
to a given distribution G. Such families arise naturally in reliability
studies. The more commonly known families of this type are those with increasing
failure rate (IFR) and increasing failure rate on the average (IFRA) and,
naturally, those with the corresponding decreasing properties. In dealing with
such classes we do not know the exact forms of the distributions that belong
to $\mathcal{F}$, but we do know the nature of the partial order relation and
the distribution G. Precisely this knowledge enables one to find bounds for
quantities of interest such as the probability of survival and mean life in
terms of G. Inequalities are thus very important in reliability studies. Not
surprisingly, significant contributions to inequalities for restricted
families have been made by researchers in mathematical reliability: Barlow,
Marshall and Proschan, to mention a few. Typical of these problems is the
use of order statistics. Many important order statistics inequalities that
arise in inference problems of reliability are reviewed by Gupta and
Panchapakesan [29].

Selection procedures for restricted families of distributions were
first studied by Barlow and Gupta [7]. In these problems, we cannot evaluate
the infimum of the PCS when we have k populations from $\mathcal{F}$; however,
we can evaluate a lower bound for this infimum in terms of the known
distribution G using probability inequalities. We describe in this section such
inequalities and explain the contexts of the selection problems. For the
purpose of describing these results, we need to introduce some definitions.

Assuming  that  all  our  distributions  are  absolutely  continuous,  we  now 
define  some  of  the  special  order  relations  of  interest  to  us.  F  and  G  denote 
distribution  functions. 

Definitions 3.1. (i) F is said to be convex with respect to (w.r.t.)
G (written $F <_c G$) if and only if $G^{-1}F(x)$ is convex on the support of F.
(ii) F is star shaped w.r.t. G ($F <_* G$) if and only if $F(0) = G(0) = 0$ and
$G^{-1}F(x)/x$ is increasing in $x \ge 0$ on the support of F. (iii) F is tail
ordered w.r.t. G ($F <_t G$) if and only if $F(0) = G(0) = 1/2$, and
$G^{-1}F(x) - x$ is nondecreasing on the support of F.

If $G(x) = 1 - e^{-x}$, $x \ge 0$, then (i) defines the class of IFR
distributions studied by Barlow, Marshall and Proschan [9] while (ii) defines
the class of IFRA distributions studied by Birnbaum, Esary and Marshall [14].
Convex ordering was studied by van Zwet [60]. Doksum [17] has used the tail
ordering. It is easy to verify that the above order relations are all partial
order relations. One can also easily see that convex ordering implies star
ordering. Without the assumption of the common median zero, the definition
(iii) has been used by Bickel and Lehmann [13] to define an ordering by spread,
with the germinal concept attributed by them to Brown and Tukey [15]. This kind
of ordering has also been perceived by Saunders and Moran [56] in the context of
a neurobiological problem and is called ordering by dispersion by them. We
now give a formal definition below.

Definition 3.2. G is more dispersed than F ($F <_{\mathrm{disp}} G$) if

(3.1)  $G^{-1}(\beta) - G^{-1}(\alpha) \ge F^{-1}(\beta) - F^{-1}(\alpha)$  for all $0 < \alpha < \beta < 1$.

By setting $x = F^{-1}(\beta)$ and $y = F^{-1}(\alpha)$, it is easy to see that
(3.1) is equivalent to saying that $G^{-1}F(t) - t$ is increasing in t.
However, (3.1) presents the idea more clearly, that is, any two percentage
points of G are at least as far apart as the corresponding percentage points
of F.
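Condition (3.1) is easy to check on a grid when both quantile functions are available in closed form. The sketch below takes, as an assumed illustrative pair, F uniform on (0, 1) and G standard exponential; the grid and the function names are hypothetical.

```python
from math import log

def F_inv(u):
    """Quantile function of the Uniform(0, 1) distribution."""
    return u

def G_inv(u):
    """Quantile function of the standard exponential distribution."""
    return -log(1.0 - u)

def more_dispersed(big_inv, small_inv, grid, tol=1e-12):
    """Check (3.1) on a grid: quantile gaps of 'big' dominate those of 'small'."""
    return all(
        big_inv(b) - big_inv(a) >= small_inv(b) - small_inv(a) - tol
        for a in grid for b in grid if a < b
    )

grid = [i / 100 for i in range(1, 100)]
print(more_dispersed(G_inv, F_inv, grid))   # G versus F
print(more_dispersed(F_inv, G_inv, grid))   # the reverse comparison
```

Here the derivative of the exponential quantile function, 1/(1-u), is at least 1, which is the derivative of the uniform quantile function, so the exponential is the more dispersed of the two and the reverse comparison fails.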

Finally, we define a general partial order relation through a class of
real functions introduced by Gupta and Panchapakesan [29]. The star and tail
orderings can be obtained as special cases.

Definition 3.3. Let $\mathcal{H} = \{h(x)\}$ be a class of real valued functions
h(x) defined on the real line. Let F and G be distributions on the real line
such that $F(0) = G(0)$. We say that F is $\mathcal{H}$-ordered w.r.t. G
($F <_{\mathcal{H}} G$) if

(3.2)  $G^{-1}F(h(x)) \ge h(G^{-1}F(x))$

for all $h \in \mathcal{H}$ and all x on the support of F.

All the order relations we have defined so far can easily be verified
to be partial order relations in that they satisfy only reflexivity and
transitivity. It can be seen immediately from the above definition that, if
$\mathcal{H} = \{ax,\ a \ge 1\}$ and $F(0) = G(0) = 0$, we get the star ordering
and that the tail ordering is obtained by taking
$\mathcal{H} = \{x + b,\ b \ge 0\}$ and $F(0) = G(0) = 1/2$.
Also, if we do not include $F(0) = G(0)$ in the definition, then the dispersion
ordering becomes a special case.
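The link between the star ordering and the class of functions h(x) = ax, a >= 1, can be seen on a concrete pair. The sketch below assumes F is the Weibull distribution with shape 2, F(x) = 1 - exp(-x^2), and G is the standard exponential, so that $G^{-1}F(x) = x^2$; both choices, the grid, and the function names are hypothetical illustrations.

```python
from math import expm1, log1p

def F(x):
    """Weibull cdf with shape 2 (illustrative choice): 1 - exp(-x^2)."""
    return -expm1(-x * x)

def G_inv(u):
    """Quantile function of the standard exponential distribution."""
    return -log1p(-u)

def transform(x):
    """The composite G^{-1} F(x); for this pair it equals x^2."""
    return G_inv(F(x))

def satisfies_32(xs, a_values, tol=1e-9):
    """Check G^{-1}F(a x) >= a G^{-1}F(x) on a grid of x and a values."""
    return all(transform(a * x) >= a * transform(x) - tol
               for x in xs for a in a_values)

def ratio_nondecreasing(xs, tol=1e-9):
    """Check that G^{-1}F(x)/x is nondecreasing along the grid xs."""
    ratios = [transform(x) / x for x in xs]
    return all(r1 <= r2 + tol for r1, r2 in zip(ratios, ratios[1:]))

xs = [0.05 * i for i in range(1, 31)]     # grid on (0, 1.5]
a_values = [1.0, 1.5, 2.0]
print(satisfies_32(xs, a_values), ratio_nondecreasing(xs))
```

Both checks succeed for this pair, in line with the statement that the two formulations describe the same ordering; the Weibull distribution with shape 2 is a standard example of an IFRA distribution.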

The next theorem gives the basic inequality of Gupta and Panchapakesan [29]
and some related inequalities.

Theorem 3.4. Let $X_0, X_1, \ldots, X_p$ ($Y_0, Y_1, \ldots, Y_p$) be
independent and identically distributed, each with distribution function F (G),
and let $F <_{\mathcal{H}} G$. Then the following inequalities hold.

(a)  $\Pr\{h(X_0) \ge X_i,\ i = 1, \ldots, p\} \ge \Pr\{h(Y_0) \ge Y_i,\ i = 1, \ldots, p\}$,

(b)  $\Pr\{X_0 \ge h(X_i),\ i = 1, \ldots, p\} \le \Pr\{Y_0 \ge h(Y_i),\ i = 1, \ldots, p\}$,

(c)  $\Pr\{h(X_0) \le X_i,\ i = 1, \ldots, p\} \le \Pr\{h(Y_0) \le Y_i,\ i = 1, \ldots, p\}$,

(d)  $\Pr\{X_0 \le h(X_i),\ i = 1, \ldots, p\} \ge \Pr\{Y_0 \le h(Y_i),\ i = 1, \ldots, p\}$.

Proof. We will prove (a). The other inequalities can be established
similarly. Let $\varphi = G^{-1}F$. Then

$\Pr\{h(X_0) \ge X_i,\ i = 1, \ldots, p\}$

  $= \Pr\{\varphi(h(X_0)) \ge \varphi(X_i),\ i = 1, \ldots, p\}$, since $\varphi$ is nondecreasing

  $\ge \Pr\{h(\varphi(X_0)) \ge \varphi(X_i),\ i = 1, \ldots, p\}$, since $F <_{\mathcal{H}} G$

  $= \Pr\{h(Y_0) \ge Y_i,\ i = 1, \ldots, p\}$, since $\varphi(X_i)$ is stochastically equal to
$Y_i$, $i = 0, 1, \ldots, p$. □

The inequalities (a) through (d) of the above theorem can be rewritten
respectively as

(3.3)  $\int F^p(h(x))\, dF(x) \ge \int G^p(h(x))\, dG(x)$,

(3.4)  $\int F^p(h^{-1}(x))\, dF(x) \le \int G^p(h^{-1}(x))\, dG(x)$,

(3.5)  $\int [1 - F(h(x))]^p\, dF(x) \le \int [1 - G(h(x))]^p\, dG(x)$,

and

(3.6)  $\int [1 - F(h^{-1}(x))]^p\, dF(x) \ge \int [1 - G(h^{-1}(x))]^p\, dG(x)$,

where $h^{-1}$ is assumed to exist and the integrals extend over the supports of
the relevant distributions. Gupta [23] obtained essentially these inequalities
for any $p > 0$ under a set of hypotheses which amounts to
$\mathcal{H}$-ordering. Also, in selection and ranking problems, we typically
get the probabilities

$\Pr\{h(X_0) \ge X_i,\ i = 0, 1, \ldots, p\}$ and $\Pr\{X_0 \le h(X_i),\ i = 0, 1, \ldots, p\}$.

These are the same as the left-hand side probabilities in (a) and (d) of
Theorem 3.4 if we assume that $h(x) \ge x$. This is satisfied for natural
choices of h(x) in the procedures. It should be noted that $h(x) \ge x$ in the
special classes of $\mathcal{H}$ yielding star and tail ordering.
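Inequality (3.3) can be examined numerically for a specific pair. The sketch below again assumes F is Weibull with shape 2 and G is standard exponential, with h(x) = ax; these choices and all function names are hypothetical. Substituting u = F(x) turns each side into an integral over (0, 1), which is approximated by the midpoint rule. For p = 1 the two sides have the closed forms a^2/(1 + a^2) and a/(1 + a), which gives a check on the quadrature.

```python
from math import expm1, log1p, sqrt

def F(x):
    """Weibull cdf with shape 2 (illustrative): 1 - exp(-x^2)."""
    return -expm1(-x * x)

def F_inv(u):
    return sqrt(-log1p(-u))

def G(x):
    """Standard exponential cdf."""
    return -expm1(-x)

def G_inv(u):
    return -log1p(-u)

def integral_over_u(cdf, quantile, h, p, steps=20000):
    """Midpoint approximation of the integral of cdf(h(x))^p d cdf(x),
    computed after the substitution u = cdf(x)."""
    du = 1.0 / steps
    return sum(cdf(h(quantile((i + 0.5) * du))) ** p for i in range(steps)) * du

def both_sides(a, p):
    """Left and right sides of (3.3) for h(x) = a x."""
    def h(x):
        return a * x
    return integral_over_u(F, F_inv, h, p), integral_over_u(G, G_inv, h, p)

for p in (1, 2, 3, 4):
    lhs, rhs = both_sides(2.0, p)
    print(p, lhs, rhs, lhs >= rhs)
```

In each case the F side dominates the G side, as (3.3) asserts when F is star ordered with respect to G.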

Interesting  special  inequalities  are  obtained  by  considering  special 
pairs  of  F  and  G  in  Theorem  3.4.  We  mention  here  a  few  of  them  relevant  to 
selection  rules,  thus  generally  applying  inequalities  (a)  and  (d)  of  Theorem  3.4. 

Suppose $X_1, \ldots, X_n$ are i.i.d. with distribution F and
$Y_1, \ldots, Y_n$ are i.i.d. with distribution G. Let $F <_{\mathcal{H}} G$.
Let $F_{[j]}$ and $G_{[j]}$ denote the cdf's of the jth order statistic of the
$X_i$ and the $Y_i$ respectively. Define

$B_{j,n}(x) = \frac{n!}{(j-1)!\,(n-j)!} \int_0^x u^{j-1}(1-u)^{n-j}\, du$

so that

(3.7)  $F_{[j]}(x) = B_{j,n}(F(x)) \equiv B_{j,n}F(x)$.

Since

(3.8)  $G_{[j]}^{-1}F_{[j]}(x) = [B_{j,n}G]^{-1}B_{j,n}F(x) = G^{-1}F(x)$,

we see that order statistics preserve $\mathcal{H}$-ordering. So we get

(3.9)  $\int F_{[j]}^p(h(x))\, dF_{[j]}(x) \ge \int G_{[j]}^p(h(x))\, dG_{[j]}(x)$

and

(3.10)  $\int [1 - F_{[j]}(h^{-1}(x))]^p\, dF_{[j]}(x) \ge \int [1 - G_{[j]}(h^{-1}(x))]^p\, dG_{[j]}(x)$.

Barlow and Gupta [7] studied subset selection procedures for selecting
the distribution with the largest (smallest) $\alpha$-quantile from
$k = p + 1$ distributions that are star ordered w.r.t. G. In their procedures,
$h(x) = ax$, $a \ge 1$. With this choice of h(x), the right-hand sides of (3.9)
and (3.10) become the infimum of PCS in these two cases. Specializing these
inequalities further to the case of IFRA distributions, we get the following
corollary.


Corollary 3.5. Let $F_{[j]}$ denote the cdf of the jth order statistic in
a random sample of n observations from an IFRA distribution F. Then

(3.11)  $\int_0^\infty F_{[j]}^p(ax)\, dF_{[j]}(x) \ge \int_0^\infty G_{[j]}^p(ax)\, dG_{[j]}(x)$

and

(3.12)  $\int_0^\infty [1 - F_{[j]}(x/a)]^p\, dF_{[j]}(x) \ge \int_0^\infty [1 - G_{[j]}(x/a)]^p\, dG_{[j]}(x)$

where

(3.13)  $G_{[j]}(x) = \sum_{i=j}^{n} \binom{n}{i} [1 - e^{-x}]^i\, e^{-(n-i)x} = B_{j,n}(1 - e^{-x})$.
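The two expressions in (3.13) are the binomial-sum form and the incomplete-beta form of the cdf of an order statistic. A minimal sketch comparing them numerically is given below; the function names and the step count of the midpoint rule are assumptions of the sketch.

```python
from math import comb, exp, factorial

def B_sum(j, n, u):
    """Binomial-sum form: P(at least j of n independent events of probability u occur)."""
    return sum(comb(n, i) * u**i * (1 - u)**(n - i) for i in range(j, n + 1))

def B_integral(j, n, u, steps=20000):
    """Incomplete-beta form of B_{j,n}(u), approximated by the midpoint rule."""
    c = factorial(n) / (factorial(j - 1) * factorial(n - j))
    h = u / steps
    total = sum(((i + 0.5) * h) ** (j - 1) * (1 - (i + 0.5) * h) ** (n - j)
                for i in range(steps))
    return c * total * h

def G_order_stat_cdf(j, n, x):
    """(3.13): cdf of the jth order statistic of n standard exponential observations."""
    return B_sum(j, n, 1.0 - exp(-x))

print(B_sum(2, 5, 0.3), B_integral(2, 5, 0.3))
print(G_order_stat_cdf(1, 4, 0.5), 1.0 - exp(-4 * 0.5))
```

The second line of output uses the fact that the minimum of n standard exponential observations is exponential with rate n, which gives an independent check of the j = 1 case.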


Barlow, Gupta and Panchapakesan [8] have tabulated the values of $a^{-1}$
for which the right-hand sides of (3.11) and (3.12) are equal to P* (the
guaranteed minimum PCS) for selected values of p, n, j and P*. Gupta and
Panchapakesan [30] studied a similar quantile selection procedure for selecting
the largest quantile for distributions that are star ordered w.r.t. the standard
normal distribution folded at the origin. In this case, the inequality (3.11)
holds with $G_{[j]}(x) = B_{j,n}(2\Phi(x) - 1)$, where $\Phi(x)$ is the standard
normal cdf. The values of $a^{-1}$ for which the right-hand side of (3.11) is
equal to P* are tabulated by Gupta and Panchapakesan [30] for selected values
of p, n, j and P*.
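A constant of this kind can be approximated numerically. The sketch below is not the method of the cited tables; it is a minimal illustration, for the exponential case (3.13), that evaluates the right-hand side of (3.11) by the midpoint rule after the substitution v = 1 - exp(-x) and then solves for a by bisection. The function names, the bracketing interval [1, 1000], and the tolerances are assumptions. For j = n = 1 and p = 1 the right-hand side equals a/(1 + a), so P* = 0.75 should give a = 3.

```python
from math import comb, factorial

def B(j, n, u):
    """B_{j,n}(u) in its binomial-sum form."""
    return sum(comb(n, i) * u**i * (1 - u)**(n - i) for i in range(j, n + 1))

def b(j, n, u):
    """Derivative of B_{j,n}(u): the Beta(j, n-j+1) density."""
    c = factorial(n) / (factorial(j - 1) * factorial(n - j))
    return c * u ** (j - 1) * (1 - u) ** (n - j)

def rhs_311(a, p, j, n, steps=4000):
    """Right-hand side of (3.11) for exponential G, after substituting
    v = 1 - exp(-x), so that G(ax) = 1 - (1 - v)^a."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        total += B(j, n, 1.0 - (1.0 - v) ** a) ** p * b(j, n, v) * h
    return total

def solve_a(p_star, p, j, n, lo=1.0, hi=1000.0, tol=1e-8):
    """Bisection for the a >= 1 at which rhs_311 equals p_star
    (rhs_311 is increasing in a)."""
    if not rhs_311(lo, p, j, n) < p_star < rhs_311(hi, p, j, n):
        raise ValueError("p_star is outside the range bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rhs_311(mid, p, j, n) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = solve_a(0.75, 1, 1, 1)
print(a, 1.0 / a)
```

At a = 1 the right-hand side equals 1/(p + 1), so a solution with a > 1 exists only when P* exceeds 1/(p + 1); the sketch raises an error otherwise.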

It is easy to verify that the folded normal distribution is an IFR and
therefore an IFRA distribution. So we can obtain further inequalities by
taking $F_{[j]}(x) = B_{j,n}(2\Phi(x) - 1)$ in the above corollary.

We can get similar inequalities for F and G such that $F <_t G$. We have
to take $h(x) = x + b$, $b \ge 0$, in (3.5) and (3.6). More inequalities can be
obtained by considering $F_{[j]}$ and $G_{[j]}$ with special choices of G. These
inequalities occur in selection procedures of Barlow and Gupta [7] for selection
in terms of medians for a class of distributions (not defined in this paper)
and the procedures of Gupta and Panchapakesan [29] who have used the logistic
distribution for G.

Remarks 3.6. Suppose we take $\mathcal{H} = \{ax,\ a \ge 1\}$ in Theorem 3.4.
Then, letting $Z_1 = \max\{X_1/X_0, \ldots, X_p/X_0\}$,
$Z_2 = \min\{X_1/X_0, \ldots, X_p/X_0\}$,
$W_1 = \max\{Y_1/Y_0, \ldots, Y_p/Y_0\}$ and
$W_2 = \min\{Y_1/Y_0, \ldots, Y_p/Y_0\}$, we get

(3.14)
    $\Pr\{Z_1 \le a\} \ge \Pr\{W_1 \le a\}$,
    $\Pr\{Z_1 \le 1/a\} \le \Pr\{W_1 \le 1/a\}$,
    $\Pr\{Z_2 \ge a\} \le \Pr\{W_2 \ge a\}$,
    $\Pr\{Z_2 \ge 1/a\} \ge \Pr\{W_2 \ge 1/a\}$.

In other words, we have inequalities for the distribution functions (and
hence for quantiles) of the maximum and the minimum of certain correlated
ratios of variables with distributions F and G.

In the case of $\mathcal{H} = \{x + b,\ b \ge 0\}$, we let
$Z_1 = \max\{X_1 - X_0, \ldots, X_p - X_0\}$,
$Z_2 = \min\{X_1 - X_0, \ldots, X_p - X_0\}$,
$W_1 = \max\{Y_1 - Y_0, \ldots, Y_p - Y_0\}$ and
$W_2 = \min\{Y_1 - Y_0, \ldots, Y_p - Y_0\}$. Then, we get

(3.15)
    $\Pr\{Z_1 \le b\} \ge \Pr\{W_1 \le b\}$,
    $\Pr\{Z_1 \le -b\} \le \Pr\{W_1 \le -b\}$,
    $\Pr\{Z_2 \ge b\} \le \Pr\{W_2 \ge b\}$,
    $\Pr\{Z_2 \ge -b\} \ge \Pr\{W_2 \ge -b\}$.

We will come back to these inequalities in Section 4.3. □


4.  INEQUALITIES  FOR  SPECIFIC  DISTRIBUTIONS 

We are mainly interested in certain inequalities relating to multivariate
normal, multinomial and gamma distributions that occur in ranking and
selection problems. Of course, these are of interest otherwise too.

4.1 Inequalities for Multivariate Normal Distribution. A probability
expression that occurs frequently in selection problems is
$\Pr[X_1 \le a_1, \ldots, X_k \le a_k]$ where $X_1, X_2, \ldots, X_k$ are
identically distributed but correlated.
Most familiar of these and perhaps most often used in practice are the cases
where $X_1, \ldots, X_k$ have a joint k-variate normal or t distribution.
Evaluation of these probability integrals is difficult to accomplish as k gets
large when there is no special pattern in the associated covariance matrix
$\Sigma$. In such cases, inequalities which give good bounds become more
attractive. There are numerous results in the literature in this direction. We
will mention here only two results, namely, those of Anderson [6] and
Slepian [58]. For a detailed account of these and other related inequalities
and references, the reader is referred to the book of Tong [59] and the recent
survey paper of Eaton [19]. To state Anderson's theorem, let us define a partial
ordering $\prec$ for covariance matrices of the same order by
$\Psi \prec \Sigma$ if $\Sigma - \Psi$ is positive semidefinite.

Theorem 4.1 (Anderson [6]). Let $X = (X_1, \ldots, X_k)$ and $Y = (Y_1, \ldots, Y_k)$
be k-variate normally distributed random vectors with common mean vector zero
and covariance matrices $\Sigma$ and $\Psi$, respectively, and let E be a convex set
symmetric about the origin. Then $\Psi \prec \Sigma$ implies $\Pr[Y \in E] \ge \Pr[X \in E]$.

As we have pointed out earlier, inequalities have been used in selection
problems typically to obtain the infimum of the PCS or a lower bound for
it. One result that has been used very often at some stage of the problem
is the Slepian inequality stated below.

Theorem 4.2 (Slepian Inequality). If $X = (X_1, \ldots, X_k)$ has the k-variate
normal distribution with nonsingular covariance matrix $\Sigma = (\sigma_{ij})$, with
$\sigma_{ii} = 1$, $i = 1, \ldots, k$, then for any constants $c_1, \ldots, c_k$, the probability
$\Pr\{X_1 \le c_1, \ldots, X_k \le c_k\}$ is strictly increasing as a function of each
$\sigma_{ij}$ for $i \ne j$. In particular, if $\sigma_{ij} \ge 0$, $i, j = 1, \ldots, k$, then

$\Pr[X_i \le c_i,\ i = 1, \ldots, k] \ge \prod_{i=1}^{k} \Pr[X_i \le c_i]$.
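
The product bound in Theorem 4.2 can be checked numerically in a simple special case. The sketch below is only an illustration under stated assumptions: it takes zero means and a common correlation rho in [0, 1) for all pairs, so that the orthant probability reduces to a one-dimensional integral, which is evaluated by Simpson's rule. The function names are hypothetical and are not taken from the works cited.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def orthant_prob(c, rho, steps=4000, lim=8.0):
    """Pr[X_i <= c_i for all i] when corr(X_i, X_j) = rho for all i != j.

    Assumes zero means and unit variances. Uses the representation
    X_i = sqrt(rho) * Z_0 + sqrt(1 - rho) * Z_i with independent standard
    normal Z's, and integrates over Z_0 by Simpson's rule (steps is even).
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    h = 2.0 * lim / steps
    total = 0.0
    for j in range(steps + 1):
        z = -lim + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        cond = 1.0
        for ci in c:
            cond *= std_normal_cdf((ci - a * z) / b)
        total += weight * cond * std_normal_pdf(z)
    return total * h / 3.0


def product_bound(c):
    """Right-hand side of the Slepian product inequality."""
    out = 1.0
    for ci in c:
        out *= std_normal_cdf(ci)
    return out


if __name__ == "__main__":
    c = [0.5, 0.0, -0.3]
    for rho in (0.0, 0.2, 0.5, 0.8):
        print(rho, orthant_prob(c, rho), product_bound(c))
```

In this equicorrelated setting the computed probability equals the product bound at rho = 0 and grows with rho, which is the behavior the theorem describes.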

Motivated by a design problem with a selection and ranking goal, Rinott
and Santner [55] obtained an inequality that combines the aspects of the results
of Anderson and Slepian; namely, for $d > 0$,

(4.1)  $\int\!\!\int \Phi^{n}(d + x + \alpha y)\, \Phi^{m}(d + x)\, d\Phi(x)\, d\Phi(y) \le \int \Phi^{n+m}(d + x)\, d\Phi(x)$,

where $\Phi(x)$ is the standard normal cdf, m and n are integers such that
$m + 1 \ge n \ge 1$, and all integrals are from $-\infty$ to $\infty$. It can also be shown that
the left-hand side of (4.1) is decreasing in $|\alpha|$ for any $d \ge 0$.


4.2  Inequalities  for  Multinomial  Distributions. 


Let $X = (X_1, \ldots, X_k)$ have the multinomial distribution given by

(4.2)  $\Pr\{X = x\} = n! \prod_{i=1}^{k} (\theta_i^{x_i} / x_i!)$,

where $x = (x_1, \ldots, x_k)$, $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} \theta_i = 1$.


Define

(4.3)  $C(\theta_1, \ldots, \theta_m) = \Pr\{X_i \ge c_i,\ i = 1, \ldots, m\}$,

where $\sum_{i=1}^{m} c_i \le n$ and $m \le \min(k-1, n)$. The results of Alam [1] are summarized
in the following theorem.

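Theorem 4.3. $C(\theta_1, \ldots, \theta_m)$ is nondecreasing in $\theta_i$, $i = 1, 2, \ldots, m$.
Further, for $c_i = c_j$,

(4.4)  $C_{ijt}(\theta_1, \ldots, \theta_m) \le C_{ij}(\theta_1, \ldots, \theta_m)$,

where $C_{ij}(\theta_1, \ldots, \theta_m)$ is obtained from $C(\theta_1, \ldots, \theta_m)$ by replacing $\theta_i$ and
$\theta_j$ with their average, and $C_{ijt}(\theta_1, \ldots, \theta_m)$ is obtained from $C(\theta_1, \ldots, \theta_m)$ by
substituting t for $\theta_i$ and $\theta_i + \theta_j - t$ for $\theta_j$, where $0 \le t \le \min(\theta_i, \theta_j)$.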

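Let us assume here and in what follows on the multinomial distribution that
$\theta_1 \le \theta_2 \le \cdots \le \theta_k$. From Theorem 4.3, we have

(4.5)  $\Pr\{X_1 \ge c, \ldots, X_k \ge c \mid \theta_1, \ldots, \theta_1, \theta^*\}$
         $\le \Pr\{X_1 \ge c, \ldots, X_k \ge c \mid \theta_1, \ldots, \theta_k\}$
         $\le \Pr\{X_1 \ge c, \ldots, X_k \ge c \mid \bar\theta, \ldots, \bar\theta\}$,

where $c < n/k$, $\theta^* = 1 - (k-1)\theta_1$ and $\bar\theta = \sum \theta_i / k$.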
Using a representation of $\Pr\{X_1 \ge c, \ldots, X_k \ge c \mid \theta_1, \ldots, \theta_k\}$
in terms of the Dirichlet integral, the inequalities in (4.5) can be
obtained as a special case of Theorem 1 of Olkin [51], which shows the Dirichlet
integral to be a Schur function. More general results are available in
Marshall and Olkin ([47], p. 306).

Bechhofer, Elmaghraby and Morse [10] considered a single sample selection
procedure to select the most probable cell with a minimum guaranteed
probability P* that the selected cell will be the one associated with $\theta_k$
whenever $\theta_k/\theta_{k-1} \ge \delta > 1$. The rule R proposed by Bechhofer, Elmaghraby
and Morse takes a sample of N observations and selects the cell that yields
the largest number of observations, using randomization to break ties. The
PCS is given by

(4.6)  PCS $= \Pr\{X_k > X_j,\ j \ne k\} + \frac{1}{2} \sum_{i \ne k} \Pr\{X_k = X_i,\ X_k > X_j,\ j \ne i, k\}$
           $+ \cdots + \frac{1}{k} \Pr\{X_k = X_{k-1} = \cdots = X_1\}$
           $= \psi(\theta_1, \theta_2, \ldots, \theta_k)$, say.

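The following result of Kesten and Morse [41] gives the LFC.

Theorem 4.4. With the above assumptions and notations,

(4.7)  $\psi(\theta_1, \ldots, \theta_k \mid \theta_k/\theta_{k-1} \ge \delta > 1) \ge \psi(\theta_1^*, \ldots, \theta_k^*)$,

where $\theta_1^* = \cdots = \theta_{k-1}^* = (\delta + k - 1)^{-1}$ and $\theta_k^* = \delta(\delta + k - 1)^{-1}$.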

Cacoullos and Sobel [16] used an inverse sampling rule for the same
selection problem. Observations are obtained sequentially until one of the
k cells has a prespecified count N. This particular cell is then identified
as the most probable cell. In this case, the PCS can be written as a Dirichlet
integral and the LFC is the same as that of the single sample procedure of
Bechhofer, Elmaghraby and Morse [10]. Alam [3] considered a different
stopping rule, namely, the observations are taken sequentially until the
difference between the highest and the next highest cell count is equal to r.
For k = 2,

(4.8)  PCS $= \lambda^r / (1 + \lambda^r)$,

where $\lambda = \theta_2/\theta_1$. For k > 2, there is no exact result. Alam [3] gives a
lower bound, namely,

(4.9)  PCS $\ge 1 - \sum_{i=1}^{k-1} \lambda_i^r / (1 + \lambda_i^r)$,

where $\lambda_i = \theta_i/\theta_k$, $i = 1, \ldots, k-1$. An improved bound, namely,
$\theta_k^r / \sum_{i=1}^{k} \theta_i^r$, was recently given by Levin and Robbins [45].
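
For k = 2 the closed form for the PCS under the difference stopping rule can be compared with a direct calculation. The sketch below is an illustration only. It treats the difference of the two cell counts as a random walk that stops when it first reaches +r (correct selection) or -r, and propagates the probability mass step by step. The function names and the fixed number of propagation steps are assumptions made here, not part of Alam's paper.

```python
def pcs_formula(lam, r):
    """Closed form lam^r / (1 + lam^r), with lam = theta_2 / theta_1."""
    return lam ** r / (1.0 + lam ** r)


def pcs_by_propagation(theta2, r, n_steps=5000):
    """PCS for k = 2 cells, where theta2 > 1/2 is the larger cell probability.

    Tracks the distribution of the count difference (cell 2 minus cell 1)
    and accumulates the mass absorbed at +r. Mass reaching -r is an
    incorrect selection and is dropped. n_steps is assumed large enough
    for the unabsorbed mass to be negligible.
    """
    theta1 = 1.0 - theta2
    mass = {0: 1.0}
    correct = 0.0
    for _ in range(n_steps):
        nxt = {}
        for diff, m in mass.items():
            for step, pr in ((1, theta2), (-1, theta1)):
                d = diff + step
                if d == r:
                    correct += m * pr
                elif d > -r:
                    nxt[d] = nxt.get(d, 0.0) + m * pr
        mass = nxt
    return correct


if __name__ == "__main__":
    for theta2, r in ((0.6, 1), (0.6, 3), (0.7, 4)):
        lam = theta2 / (1.0 - theta2)
        print(theta2, r, pcs_by_propagation(theta2, r), pcs_formula(lam, r))
```

The two columns printed by the usage example should agree to many decimal places, since the absorption probability of the walk is exactly the expression in (4.8).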

Going back to the single sample procedure of Bechhofer, Elmaghraby
and Morse [10] for selecting the most probable cell, the LFC is sought subject
to $\theta_k/\theta_{k-1} \ge \delta > 1$. If we are interested in selecting the least probable cell,
then the analogous problem will be to get the LFC whenever $\theta_2/\theta_1 \ge \delta > 1$.
The analogous procedure will select the cell with the least count, using
randomization to break ties. In this case, a minimum P* for the PCS cannot be
guaranteed for all P*. This is shown by Alam and Thompson [5], who proposed
a modified indifference zone. Their rule is still to select the cell with
the least count. Let $\psi'(\theta_1, \ldots, \theta_k)$ denote the PCS for this rule. Then their
LFC result can be stated as follows:

(4.8)  $\psi'(\theta_1, \ldots, \theta_k \mid \theta_2 - \theta_1 \ge c) \ge \psi'(\theta_1^*, \ldots, \theta_k^*)$,

where $0 < c < (k-1)^{-1}$, $\theta_1^* = [1 - (k-1)c]/k$, and $\theta_2^* = \cdots = \theta_k^* = (1 + c)/k$.
We get additional probability inequalities via subset selection rules.
Gupta and Nagel [27] discussed single sample subset selection rules for
selecting the most (least) probable cell. If we denote the cell counts by
$X_1, \ldots, X_k$, their rules $R_1$ and $R_2$ for the most and the least probable cell,
respectively, are as follows:

Select the cell with count $X_i$ if and only if

$R_1$:  $X_i \ge \max(X_1, \ldots, X_k) - d$

$R_2$:  $X_i \le \min(X_1, \ldots, X_k) + c$

where c and d are nonnegative integers chosen suitably to guarantee the
specified minimum PCS.

The PCS for $R_1$ is given by

(4.9)  $P(CS \mid R_1) = F(k, n, d; \theta_1, \ldots, \theta_k) = \sum \binom{n}{\nu_1, \ldots, \nu_k} \theta_1^{\nu_1} \cdots \theta_k^{\nu_k}$,

where the summation is over all k-tuples $(\nu_1, \ldots, \nu_k)$ such that the $\nu_i$
are nonnegative, $\sum \nu_i = n$ and $\nu_i \le \nu_k + d$, $i = 1, \ldots, k-1$. In the case of $R_2$,
$P(CS \mid R_2) = G(k, n, c; \theta_1, \ldots, \theta_k)$ is given by the summation in (4.9) extending
over k-tuples $(\nu_1, \ldots, \nu_k)$ such that the $\nu_i$ are nonnegative, $\sum \nu_i = n$ and
$\nu_i \ge \nu_1 - c$, $i = 2, \ldots, k$.
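
The rule $R_1$ and the sum defining its PCS can be evaluated by brute force for small n and k. The following sketch is illustrative only: it enumerates all multinomial outcomes, which is feasible only for small problems, and it assumes the cell probabilities are listed in increasing order so that the last cell is the most probable one. The function names are hypothetical.

```python
from math import factorial


def compositions(n, k):
    """Yield all k-tuples of nonnegative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def multinomial_pmf(counts, theta):
    """n! * prod(theta_i^x_i / x_i!) for the given cell counts."""
    coef = factorial(sum(counts))
    prob = 1.0
    for v, t in zip(counts, theta):
        coef //= factorial(v)  # each partial quotient is an integer
        prob *= t ** v
    return coef * prob


def select_r1(counts, d):
    """Indices of the cells retained by rule R_1: X_i >= max(X) - d."""
    top = max(counts)
    return [i for i, x in enumerate(counts) if x >= top - d]


def pcs_r1(n, d, theta):
    """P(CS | R_1): probability that the last (most probable) cell is retained."""
    k = len(theta)
    return sum(
        multinomial_pmf(v, theta)
        for v in compositions(n, k)
        if (k - 1) in select_r1(v, d)
    )


if __name__ == "__main__":
    theta = (0.2, 0.3, 0.5)
    for d in range(4):
        print(d, pcs_r1(6, d, theta))
```

Retaining the last cell is the same event as $\nu_i \le \nu_k + d$ for all other cells, so `pcs_r1` is a direct evaluation of the sum over that set of k-tuples. Increasing d can only enlarge the set, so the computed PCS is nondecreasing in d.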

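We now summarize the inequality results of Gupta and Nagel [27] in the following
lemmas and theorems.

Lemma 4.5. $F(k, n, d; \theta_1, \ldots, \theta_k)$ satisfies the following inequalities:

(1) For $1 \le i < j < k$ and $0 < \varepsilon \le \theta_i$,

$F(k, n, d; \theta_1, \ldots, \theta_k) \ge F(k, n, d; \theta_1, \ldots, \theta_i - \varepsilon, \ldots, \theta_j + \varepsilon, \ldots, \theta_k)$.

(2) For $1 \le i < k$ and $0 < \varepsilon \le \theta_k$,

$F(k, n, d; \theta_1, \ldots, \theta_k) \ge F(k, n, d; \theta_1, \ldots, \theta_i + \varepsilon, \ldots, \theta_k - \varepsilon)$.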
It should be noted that Lemma 4.5 is true even if the order is
disturbed in the configurations on the right-hand side of the inequalities.
The next theorem on the LFC is a consequence of Lemma 4.5.

Theorem 4.6. Let r be the smallest integer for which $\theta_r > 0$ and let
s be the largest integer such that $\theta_s < \theta_k$. For a configuration minimizing
$F(k, n, d; \theta_1, \ldots, \theta_k)$, we have $r \ge s$. Furthermore, if $r = k-1$, then $r > s$.

In other words, Theorem 4.6 says that the worst configuration is of
the type $(0, \ldots, 0, \alpha, \theta, \ldots, \theta)$, $\alpha \le \theta$.

Lemma 4.7. $G(k, n, c; \theta_1, \ldots, \theta_k)$ satisfies the following inequalities:

(1) For $1 < i < j \le k$ and $0 < \varepsilon \le \theta_i$,

$G(k, n, c; \theta_1, \ldots, \theta_k) \ge G(k, n, c; \theta_1, \ldots, \theta_i - \varepsilon, \ldots, \theta_j + \varepsilon, \ldots, \theta_k)$.

(2) For $1 < j \le k$ and $0 < \varepsilon \le \theta_j$,

$G(k, n, c; \theta_1, \ldots, \theta_k) \ge G(k, n, c; \theta_1 + \varepsilon, \ldots, \theta_j - \varepsilon, \ldots, \theta_k)$.

As in the case of Lemma 4.5, here also the statements are true even if
the order is disturbed in the configuration. The following theorem is a
consequence of Lemma 4.7.

Theorem 4.8. $G(k, n, c; \theta_1, \ldots, \theta_k)$ is minimized at a configuration of
the type $\theta_1 = \cdots = \theta_{k-1} \le \theta_k$.

Now, let us consider $\ell$ independent multinomial distributions, each with
k cells. Let $\theta_i = (\theta_{i1}, \ldots, \theta_{ik})$ be the vector of the cell probabilities of
$\pi_i$, the ith distribution, $i = 1, \ldots, \ell$. We also assume that, for each i,
$\theta_{i1} \le \cdots \le \theta_{ik}$.

Definition 4.9. We say that $\theta_i$ majorizes $\theta_j$ ($\theta_i \succ \theta_j$) if

$\sum_{\alpha=r}^{k} \theta_{i\alpha} \ge \sum_{\alpha=r}^{k} \theta_{j\alpha}$ for $r = 1, \ldots, k$, with equality holding for $r = 1$.
Definition 4.10. If a function $\varphi$ satisfies the property that
$\varphi(x) \le \varphi(y)$ ($\varphi(x) \ge \varphi(y)$) whenever $x \succ y$, then $\varphi$ is called a Schur-concave
(Schur-convex) function.

If $\theta_i \succ \theta_j$, it implies that $H(\theta_i) \le H(\theta_j)$, where
$H(\theta_i) = -\sum_{\alpha=1}^{k} \theta_{i\alpha} \log \theta_{i\alpha}$
is the Shannon entropy function associated with $\pi_i$.
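
Definitions 4.9 and 4.10 can be illustrated with a few lines of code. The sketch below checks majorization through the tail sums of the increasingly ordered components and evaluates the Shannon entropy, which is Schur-concave and should therefore be smaller for the majorizing vector. The function names and the numerical tolerance are choices made for this illustration.

```python
import math


def majorizes(x, y, tol=1e-12):
    """True if x majorizes y in the sense of Definition 4.9.

    After sorting both vectors increasingly, every sum of the r largest
    components of x must be at least the corresponding sum for y, and the
    totals must agree (within tol, to absorb rounding).
    """
    xs, ys = sorted(x), sorted(y)
    if len(xs) != len(ys) or abs(sum(xs) - sum(ys)) > tol:
        return False
    tail_x = tail_y = 0.0
    for a, b in zip(reversed(xs), reversed(ys)):
        tail_x += a
        tail_y += b
        if tail_x < tail_y - tol:
            return False
    return True


def shannon_entropy(p):
    """H(p) = -sum p_i log p_i, with the convention 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in p if t > 0.0)


if __name__ == "__main__":
    x = [0.1, 0.2, 0.7]
    y = [0.2, 0.3, 0.5]
    print(majorizes(x, y), shannon_entropy(x), shannon_entropy(y))
```

Here x majorizes y, and the entropy of x is the smaller of the two, in line with the implication stated above.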

Suppose we take n independent observations from each multinomial
distribution. Let $X_{i\alpha}$ denote the number of outcomes in the cell with probability
$\theta_{i\alpha}$ in $\pi_i$, $\alpha = 1, \ldots, k$; $i = 1, \ldots, \ell$. Define


(4.10)  Qj(n,k,«.*,  e ^  ,  ....  e^) 


=  Pr  (<p(~£k 


max 

1  <a<l 


)-  d>,  j  =  1,  ... 


where $\varphi$ is a Schur-concave function and $d > 0$.

Gupta and Wong [34] investigated a subset selection rule for selecting
the population whose cell probability vector majorizes that of any other,
assuming that one such exists. The special case of k = 2 multinomial
distributions with the Shannon entropy function as a particular choice of $\varphi$ was
earlier considered by Gupta and Huang [24]. The following theorem relates
to the properties of the procedure of Gupta and Wong [34].


Theorem  4.11 .  If  >  e^.,  then  Q.(n,k,t;  , 

. . ,  e  f ).  Further,  if  e.  ^  e ^  for  all  j=l . i. 


•  •  >  i£)  <  Qj(n,k,n;  e_1 , 
then  Q.. (n,k,u;  ...  e^) 


Q.(n,k,?;  e  =  ...  =  e  ). 


4.3 Inequalities for the Gamma Distribution

Let

(4.11)  $\gamma(m, x) = \int_0^x t^{m-1} e^{-t}\, dt$

and

(4.12)  $\Gamma(m, x) = \Gamma(m) - \gamma(m, x)$,  $m > 0$.

Of course,

(4.13)  $f(x; m) = e^{-x} x^{m-1} / \Gamma(m)$,  $x > 0$, $m > 0$,

is the gamma density where m is the shape parameter. For 0 < m < 1,
continued fraction expansions can be obtained (see, for example, Khovanskii [42])
for $x^{-m} e^{x} \gamma(m, x)$ and $x^{-m} e^{x} \Gamma(m, x)$. Let $P_n(m,x)/Q_n(m,x)$ and $P'_n(m,x)/Q'_n(m,x)$
be the nth convergents of these two expansions, respectively.

In the case of $\gamma(m, x)$, Gupta and Waknis [33] obtained the system of
inequalities:

(4.14)  $\frac{P_n(m,x)}{Q_n(m,x)} < e^{x} x^{-m} \gamma(m,x) < \frac{P_n(m,x)}{Q_n(m,x)} + \frac{(n+1+m)\, x^{n}}{(n+m)_{n+1}\,(n+1+m-x)}$,  $n = 1, 2, \ldots$,

where $x < n + m + 1$ is a necessary restriction only on the inequalities on the
right-hand side of (4.14) and where $(n)_r = n(n-1) \cdots (n-r+1)$, $r \ge 1$, and

(4.15)  $\frac{P_n(m,x)}{Q_n(m,x)} = \frac{1}{m} \left[ 1 + \frac{x}{1+m} + \frac{x^2}{(1+m)(2+m)} + \cdots + \frac{x^{n-1}}{(1+m) \cdots (n-1+m)} \right]$.
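
A numerical illustration of two-sided bounds of this kind is sketched below. It relies on the standard series $e^{x} x^{-m} \gamma(m,x) = \sum_{j \ge 0} x^{j} / [m(m+1)\cdots(m+j)]$, whose partial sums are assumed here to play the role of the convergents $P_n/Q_n$, and on a geometric bound for the omitted tail, which is valid only for $x < n + m + 1$. Both the function names and the use of closed forms for m = 1 and m = 2 as reference values are choices made for this sketch, not taken from Gupta and Waknis [33].

```python
import math


def partial_sum(m, x, n):
    """(1/m) * [1 + x/(1+m) + ... + x^(n-1) / ((1+m)...(n-1+m))]."""
    term = 1.0 / m
    total = 0.0
    for j in range(n):
        total += term
        term *= x / (m + j + 1)
    return total


def geometric_remainder(m, x, n):
    """Upper bound on the omitted tail of the series, for x < n + m + 1.

    The first omitted term is x^n / (m (m+1) ... (m+n)); later terms shrink
    by a factor of at most x / (n + m + 1), giving a geometric bound.
    """
    if x >= n + m + 1:
        raise ValueError("the bound requires x < n + m + 1")
    first = x ** n
    for j in range(n + 1):
        first /= (m + j)
    return first * (n + m + 1) / (n + m + 1 - x)


if __name__ == "__main__":
    x = 1.5
    exact = (math.exp(x) - 1.0) / x  # e^x x^(-1) gamma(1, x)
    for n in range(1, 6):
        low = partial_sum(1.0, x, n)
        print(n, low, exact, low + geometric_remainder(1.0, x, n))
```

In the usage example the lower and upper values close in on the exact quantity as n increases, which shows how quickly such bounds tighten for moderate x.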


In the case of $\Gamma(m, x)$, the even order convergents form a monotonic
increasing sequence and the odd order convergents form a monotonic decreasing
sequence, both converging to $e^{x} x^{-m} \Gamma(m, x)$. So a system of inequalities can
be generated by bounding $e^{x} x^{-m} \Gamma(m, x)$ by successive convergents. These
bounds are discussed in Gupta and Waknis [33]. These bounds in turn can be
used to get bounds on the integrals

(4.16)  $\int_0^\infty F^{p}(cx; m)\, f(x; m)\, dx$

and

(4.17)  $\int_0^\infty [1 - F(bx; m)]^{p}\, f(x; m)\, dx$,

where F(x;m) is the cdf of the gamma distribution. The integrals (4.16) and
(4.17) with $c \ge 1$ and $0 < b \le 1$ are the infima of the PCS for the subset
selection rules of Gupta [21] and Gupta and Sobel [32].

Now, let $X_0, X_1, \ldots, X_p$ be independent and identically distributed, each
having a gamma distribution with density f(x;m) given by (4.13). Let

(4.18)  $Z_1 = \max_{1 \le i \le p} X_i / X_0$,  $Z_2 = \min_{1 \le i \le p} X_i / X_0$.

Let $G_m(y)$ and $H_m(y)$ denote the cdf's of $Z_1$ and $Z_2$, respectively. We note
that the integrals in (4.16) and (4.17) are $G_m(c)$ and $1 - H_m(b)$, respectively.
Alam [2] proved that, for m > 1, $H_m(y)$ is increasing in m for y > 1 and is
decreasing in m for y < 1. Alam's proof involves a fair amount of analytical
detail. Further, Alam has no comment on the behavior of $G_m(y)$. The following
theorem establishes the validity of Alam's result for m > 0 and gives the
monotonicity behavior of $G_m$ and $H_m$ for a larger class of distributions.


Theorem 4.12. Let $X_0, X_1, \ldots, X_p$ be i.i.d. nonnegative random variables,
each having the distribution $F_\lambda$, where $\{F_\lambda\}$ is a star-preceding family in $\lambda$
[i.e., $F_{\lambda_2} <_* F_{\lambda_1}$ for $\lambda_1 < \lambda_2$]. Let $G_\lambda$ and $H_\lambda$ be the cdf's of $Z_1$ and $Z_2$ defined
in (4.18). Then $G_\lambda(y)$ and $H_\lambda(y)$ are both increasing in $\lambda$ for y > 1 and
decreasing in $\lambda$ for y < 1.

Proof. Since $F_{\lambda_2} <_* F_{\lambda_1}$ for $\lambda_1 < \lambda_2$, the conclusions of the theorem
follow immediately from the inequalities (3.14) of Remarks 3.6. □


Remarks 4.13. In the case of the gamma family $\{F_m\}$, it is known
that $F_m$ is convex ordered in m > 0; see van Zwet [60], p. 60. Since the
convex ordering implies the star ordering, Alam's result readily follows from
Theorem 4.12. As we pointed out earlier, in subset selection procedures, we
typically encounter $G_m(y)$ for y < 1 and $H_m(y)$ for y > 1. That the monotonicity
properties of $G_m(y)$ and $H_m(y)$ in these cases can be established by the
star-ordering property of the gamma distribution was known though not formally
demonstrated; see McDonald [48] and Panchapakesan [53], who have given
alternative proofs in the case of integral m for p = 1 and $p \ge 1$, respectively.
Finally, the monotonicity property of $H_m(y)$ is applied to evaluate the infimum
of the PCS for the inverse sampling procedure of Cacoullos and Sobel [16]
for selecting the most probable multinomial cell. □
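
The monotonicity in the shape parameter stated in Theorem 4.12 can be observed numerically for the gamma family. The sketch below is an illustration under simplifying assumptions: it restricts m to positive integers, so that the gamma cdf has the closed Erlang form and only the standard library is needed, and it evaluates the integrals (4.16) and (4.17) by Simpson's rule on a truncated range. The function names, the truncation point and the step count are choices made here.

```python
import math


def gamma_pdf(x, m):
    """Gamma density e^(-x) x^(m-1) / (m-1)! for integer shape m >= 1."""
    return math.exp(-x) * x ** (m - 1) / math.factorial(m - 1)


def gamma_cdf(x, m):
    """Erlang cdf: 1 - e^(-x) * sum_{j < m} x^j / j! (integer m only)."""
    term, s = 1.0, 0.0
    for j in range(m):
        s += term
        term *= x / (j + 1)
    return 1.0 - math.exp(-x) * s


def simpson(f, a, b, steps):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def G(y, m, p, upper=80.0, steps=8000):
    """Pr[max_i X_i / X_0 <= y], the integral (4.16) with c = y."""
    return simpson(lambda x: gamma_cdf(y * x, m) ** p * gamma_pdf(x, m),
                   0.0, upper, steps)


def H(y, m, p, upper=80.0, steps=8000):
    """Pr[min_i X_i / X_0 <= y], one minus the integral (4.17) with b = y."""
    return 1.0 - simpson(
        lambda x: (1.0 - gamma_cdf(y * x, m)) ** p * gamma_pdf(x, m),
        0.0, upper, steps)


if __name__ == "__main__":
    for m in (1, 2, 3, 5):
        print(m, G(2.0, m, 3), G(0.5, m, 3), H(2.0, m, 3), H(0.5, m, 3))
```

In the printed table, the columns for y = 2 should increase with m and the columns for y = 0.5 should decrease, as the theorem asserts for y > 1 and y < 1.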


For the gamma distribution with density in (4.13), let $\xi_m(\alpha)$ and $\xi_m(\beta)$
denote the $\alpha$th and the $\beta$th quantiles, where $0 < \alpha < \beta < 1$. For $m_1 < m_2$, as
pointed out earlier, $F_{m_2} <_* F_{m_1}$. This is equivalent to

$\xi_{m_2}(\beta) / \xi_{m_2}(\alpha) \le \xi_{m_1}(\beta) / \xi_{m_1}(\alpha)$;

in other words, $\xi_m(\beta) / \xi_m(\alpha)$ decreases in m, a result obtained by Saunders
and Moran [56] using a fairly long direct method. They have also shown that,
for $m_1 < m_2$, $F_{m_2}$ is more dispersed than $F_{m_1}$; in other words, $\xi_m(\beta) - \xi_m(\alpha)$
increases in m. Also, we can now apply the inequalities in (3.15) to obtain
new inequalities for the distribution functions of the maximum and the minimum
of certain correlated differences.


4.4  Inequalities  Arising  From  A  Two  Stage  Selection  Procedure. 

Gupta and Miescke [26] studied sequential selection procedures with
elimination which are based on vector-at-a-time sampling. They showed that
the 'natural' terminal decisions are optimum in a fairly decision-theoretic
sense. To describe the inequalities that are obtained, let $\pi_1, \ldots, \pi_k$ be k
independent populations with densities $f_{\theta_i}$, $\theta_i \in \Omega$, with respect to the
Lebesgue measure on the real line $\mathbb{R}$ or any counting measure on a lattice in
$\mathbb{R}$, where $\mathcal{F} = \{f_\theta\}$, $\theta \in \Omega$, is a one-parameter exponential family. Let
$X_{i1}, X_{i2}, \ldots$ be independent observations from $\pi_i$, $i = 1, \ldots, k$. For fixed
n < m, let $U_i = X_{i1} + \cdots + X_{in}$, $V_i = X_{i,n+1} + \cdots + X_{im}$ and $W_i = U_i + V_i$,
$i = 1, \ldots, k$. Further, for fixed $s \subseteq \{1, \ldots, k\}$, permutation symmetric
Borel set $A \subseteq \mathbb{R}^k$, and $i \in s$, define

(4.20)  $r_i = P_\theta\{V_i = \max_{j \in s} V_j\}$,
        $q_i = P_\theta\{W_i = \max_{j \in s} W_j \mid (U_1, \ldots, U_k) \in A\}$.

Theorem 4.13. For $s = \{i_1, \ldots, i_m\}$,

(1) $\theta_{i_j} \le \theta_{i_t}$ implies that $r_{i_j} \le r_{i_t}$ and $q_{i_j} \le q_{i_t}$, $j, t = 1, \ldots, m$; $j \ne t$, and

(2) the vector $r = (r_{i_1}, \ldots, r_{i_m})$ majorizes the vector $q = (q_{i_1}, \ldots, q_{i_m})$.